I. Summary.

The increase in importance of software in command and
control  and  other  complex  systems  has  not  been  accompanied 
by  commensurate  progress  in  the  development  of  analytical 
techniques  for  the  measurement  of  software  quality  and  the 
prediction  of  software  reliability.  In  recognition  of  the 
disparity,  the  Computer  Sciences  Department  of  the  Naval 
Electronics  Laboratory  Center,  San  Diego,  California  is 
sponsoring  this  software  quality  control  and  reliability 
research  project. 

The  objectives  of  the  research  are  to  develop  procedures 
for  controlling  software  quality  and  to  develop  a  methodology 
for predicting software reliability. The data employed were
Naval Tactical Data System (NTDS) Trouble Reports and
supporting documentation.

In  order  to  accomplish  these  objectives,  it  was  necessary 
to  perform  many  statistical  analyses  of  NTDS  test  data.  The 
major  analyses  are  listed  below. 

. analysis of the number of software troubles per unit time¹
as a function of cumulative test time²;

. analysis of the distribution of time between troubles³ and
the distribution of number of troubles per unit time;

. goodness of fit tests for identifying theoretical reliability
functions which might be appropriate for reliability
prediction;

. estimation of reliability function parameters;

. estimation of confidence limits for reliability function
parameters and reliability functions;

. analysis of program reliability variability between and
among programs; and

. development of equations for estimating reliability and
test requirements.

¹ Number of troubles per unit time is the number of troubles
occurring in a program test computer time interval divided by
the time interval.

² Cumulative test time is the total computer time used to
date in testing a single program.

³ Time between troubles is the cumulative computer time between
two consecutive troubles. The two consecutive troubles may
occur in the same or different test runs.

With respect to NTDS programs, the research to date
suggests these conclusions:

(1) software reliability prediction is feasible, but much
more analysis is required in order to validate the
approaches which have been developed; (2) there is greater
variability in program reliability between programs than
there is within programs; (3) in general, the occurrence
of software troubles is not a stationary process; and
(4) there is no single probability distribution which
typifies the occurrence of software troubles.

II.  Objectives  and  Approach. 

One  objective  was  to  determine  the  feasibility  of 
predicting  software  reliability  based  on  the  use  of  program  test 
results.  A  second  objective  was  to  identify  quantitative 
measures  of  program  quality  which  could  be  used  in  software 
quality  control.  A  third  objective  was  to  investigate 
methods for estimating the amount of test time which is
required in order to satisfy program reliability requirements.
Test time estimates are needed at two stages: (1) prior to
the commencement of program testing when, based on reliability
requirements, it is necessary to make an initial
estimate of required test time and (2) during testing when,
based on reliability requirements and the reliability
achieved to date, it is necessary to make an estimate of the
required remaining test time.

Two  other  areas  of  investigation  involved  the  analysis 
of sources of program reliability variation and the
identification of the appropriate program sampling unit to use
for  reliability  analysis. 

Software  trouble  reports  were  associated  with  scheduled 
test  time  in  order  to  obtain  distributions  of  time  between 
troubles. The most important distribution, from the
standpoint of reliability prediction, is time between troubles,
since $Q(t) = \int_0^t f(t)\,dt$ and $R(t) = 1 - Q(t)$, where
$f(t)$ is the density function of time between troubles, $t$ is
program operating time, $Q(t)$ is unreliability and $R(t)$ is
reliability. Thus, if a theoretical density function $f(t)$ can
be fitted to the empirical relative frequency distribution, an
estimate of the reliability function can be obtained. If no
theoretical density function is suitable, the empirical relative
frequency distribution of time between troubles can be summed
to obtain an estimate of the unreliability function from
which an estimate of the reliability function can be obtained.
Either a theoretical or empirical reliability function can be
used to predict the reliability of a program for various
program operating times. However, if a theoretical reliability
function can be used, confidence interval estimates can
be obtained for the theoretical reliability function parameters.
This is important because it is then possible to estimate
the reliability function parameters which are necessary to
achieve stated reliability objectives. With parameter
estimates available, it is also possible to estimate the amount
of test time, as a function of number of troubles, which will
be required in order to achieve reliability objectives.
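
The relationship between the distribution of time between
troubles and the reliability function can be sketched in a few
lines of Python. This is a minimal illustration, not the
procedure used in the study: the function names and the sample
values are hypothetical, and the exponential form is only one
possible theoretical function.

```python
import math


def empirical_reliability(times_between_troubles, t):
    """Empirical R(t) = 1 - Q(t), where Q(t) is the fraction of
    observed times between troubles that are <= t."""
    n = len(times_between_troubles)
    if n == 0:
        raise ValueError("need at least one observation")
    q = sum(1 for x in times_between_troubles if x <= t) / n
    return 1.0 - q


def exponential_reliability(mtbt, t):
    """Theoretical R(t) = exp(-t / MTBT) for an exponential density."""
    return math.exp(-t / mtbt)


# Hypothetical times between troubles in hours (not NTDS data).
sample = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 4.0, 5.0, 6.0, 8.0]
mtbt = sum(sample) / len(sample)  # sample mean time between troubles

for t in (1.0, 2.0, 4.0):
    print(t, empirical_reliability(sample, t),
          round(exponential_reliability(mtbt, t), 3))
```

Comparing the two columns for several operating times gives a
rough visual check of fit before any formal goodness of fit
test is applied.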

It  is  possible  to  employ  distribution-free  methods  and 
empirical  reliability  functions  to  estimate  confidence 
limits  for  the  reliability  function.  This  approach  provides 
the capability of comparing the desired reliability function
(reliability objective) with the empirical reliability
function and its confidence limits, but, with no theoretical
reliability function available, there is no capability for
making  those  parameter  estimates  which  are  of  interest  in 
reliability  analysis. 

In order to identify the major contributors to program
reliability variability, an Analysis of Variance test was
employed.  Additionally,  goodness  of  fit  tests  were  conducted 
for  various  relative  frequency  distributions  of  time  between 
troubles  and  program  run  time  in  order  to  identify  the  type 
of  distribution  which  is  applicable  to  software  failures. 

A problem of sampling arises due to the possible
nonrandomness of sample selection. In the case of program
testing, randomness would mean that each part of a program
has an equal probability of being tested. However, in
practice, samples are not "drawn" in the usual sense; rather,
inputs and program segments are selected for testing based
on the criticality of the segment to mission success, or for
some less objective reason. Some program segments will be
more intensively tested than others. Whatever the criteria
employed for program testing, it is clear that the various
program segments do not have equal probability of selection.

III. Data Analysis.

A. Trouble Rates and Program Run Time

The first analysis performed was an examination of
the pattern of trouble rates as a function of cumulative
test time. This was done in order to ascertain
whether trouble rates decrease and eventually stabilize or
whether they continue to fluctuate as testing continues. The
achievement of an approximately constant trouble rate, after
a period of testing has elapsed, would indicate that the
occurrence of troubles has stabilized and that the major
troubles have been identified and corrected.

Two types of trouble rates were analyzed. One is
a weekly trouble rate and the second is a cumulative trouble
rate. The first rate is the number of troubles detected
during a week divided by the amount of computer test time
expended during the week. This rate provides an indication
of short term fluctuations in the rate of detecting software
troubles. The second rate is the total number of troubles
which have occurred since testing began divided by the total
elapsed test time. This rate provides an indication of whether
the long term trouble rate is decreasing, constant or
increasing. A decreasing rate would indicate that program
reliability increases with increases in testing.
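
The two trouble rates defined above can be computed from
weekly counts as follows. This is a small sketch; the function
name and the weekly figures are hypothetical and are not taken
from the NTDS Trouble Reports.

```python
def trouble_rates(weekly_troubles, weekly_test_hours):
    """Return (weekly_rates, cumulative_rates) in troubles per test hour.

    weekly rate     = troubles in the week / test hours in the week
    cumulative rate = total troubles to date / total test hours to date
    """
    if len(weekly_troubles) != len(weekly_test_hours):
        raise ValueError("inputs must have the same length")
    weekly, cumulative = [], []
    total_troubles = 0
    total_hours = 0.0
    for troubles, hours in zip(weekly_troubles, weekly_test_hours):
        if hours <= 0:
            raise ValueError("each week needs positive test time")
        weekly.append(troubles / hours)
        total_troubles += troubles
        total_hours += hours
        cumulative.append(total_troubles / total_hours)
    return weekly, cumulative


# Hypothetical four weeks of testing.
weekly, cumulative = trouble_rates([8, 6, 5, 2], [10.0, 12.0, 10.0, 8.0])
print(weekly)
print(cumulative)
```

The weekly series shows the short term fluctuations, while the
cumulative series changes more slowly and is easier to read for
a long term trend.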

Another random variable which was analyzed is program
run time. Program run time is the elapsed computer
time from start of program to the occurrence of a trouble.
Hence, the random variable program run time only applies
when a trouble occurs during a test. Program run time was
used in Analysis of Variance tests for estimating the relative
contributions to variations in program reliability due to
differences between and among programs.

An analysis of program run time can also be used to
indicate whether program reliability increases as testing
continues. As testing progresses, we would expect to observe a
gradual increase in program run time and an eventual
stabilization around a mean value.

All statistical estimates presented in this report
are based on total number of troubles. In NTDS testing,
troubles are classified according to High (software unusable),
Medium (major limitation) and Low (minor limitation) severity
categories. The trouble reports were not segregated by category
of trouble for two reasons: first, the initial interest was to
obtain an overall picture of trouble occurrence distributions;
second, sample sizes are considerably reduced if troubles are
analyzed by categories.

Data concerning the occurrence of troubles and mean
program run time for several programs of Ship 1 are shown in
Figure 1 and Figure 2. The trouble rate shown in Figure 1 is
the weekly rate; the trouble rate shown in Figure 2 is the
cumulative trouble rate. Also plotted in Figure 1 is mean
program run time, computed on a weekly basis. These data
suggest that trouble rates decrease with increased testing.
It appears that fluctuations occur but with decreasing
amplitude as testing continues. Subsequent analyses of many
program modules have verified this decreasing oscillatory
behavior. The data concerning program run time are inconclusive.

In summary, the data presented here and subsequent
analysis¹ indicate that trouble rates decrease and stabilize
and that times between troubles increase and stabilize with
continued testing.

B.  Distribution  of  Time  Between  Troubles  and  the 
Reliability  Function 

Integration, with respect to time, of the probability
density function of time between troubles yields the
unreliability function, from which the reliability function can
be obtained. The reliability function is used to predict the
reliability of a program for various operating times. In
order to estimate the reliability function, it is first
necessary to obtain the empirical distribution of time between
troubles from a sample of trouble report data. Then,
parameter estimates are obtained from the sample data, and a
goodness of fit test is made in an attempt to identify an
appropriate theoretical reliability function. In addition to
its use as a reliability predictor, the reliability function
can be employed to estimate additional testing requirements
whenever predicted reliability is less than specified
reliability.

¹ Based on analyses which have been performed subsequent to
the period covered by this report.

Goodness  of  fit  tests  were  conducted  for  Program  1, 
Ship  1  to  determine  whether  typical  reliability  functions, 
such  as  the  normal  or  exponential,  would  be  appropriate  for 
predicting  program  reliability.  The  normal  test  was  of 
interest  to  ascertain  whether  software  has  an  increasing 
hazard  rate,  i.e.,  conditional  trouble  rate  increases  with 
operating time. The exponential test was of interest to
ascertain  whether  software  exhibits  a  constant  hazard  rate. 
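
The link between the choice of distribution and the hazard rate
can be made explicit with the functions defined in Section II.
The symbols $h(t)$ for the hazard rate and $\lambda$ for the
exponential rate parameter are introduced here for illustration
only and are not part of the report's notation.

$$h(t) = \frac{f(t)}{R(t)}$$

For the exponential case, $f(t) = \lambda e^{-\lambda t}$ and
$R(t) = e^{-\lambda t}$, so that

$$h(t) = \frac{\lambda e^{-\lambda t}}{e^{-\lambda t}} = \lambda ,$$

a constant that does not depend on operating time. For the
normal density the same ratio increases with $t$, which is why
the normal test serves as a check for an increasing hazard rate.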

The results of the Chi Square test for normality are
given in Table A-1 of the Appendix. The hypothesis that the
distribution is normal is rejected at the .005 level of
significance. Thus, the normal reliability function appears
inappropriate for this program. A goodness of fit test
against the exponential distribution, which used the
Kolmogorov-Smirnov (K-S) method, is shown in Table A-4 of the
Appendix. Individual time between troubles values were not
available. It was necessary to estimate time between troubles
by dividing a time interval by the number of troubles occurring
during that interval. Thirty-three Trouble Reports provided
10 time between troubles values. For this small sample, the
hypothesis of an exponential distribution was accepted at the
.05 level of significance. The theoretical exponential
reliability and empirical reliability functions are shown in
Figure 3.
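
A K-S comparison against the exponential distribution can be
sketched as follows. This is an illustration under stated
assumptions rather than the calculation in Table A-4: the
sample values are hypothetical, the exponential mean is
estimated from the sample itself, and the critical value 0.409
is the usual tabulated K-S value for 10 observations at the .05
level, which strictly applies when the distribution is fully
specified in advance.

```python
import math


def ks_statistic_exponential(sample):
    """K-S distance D between the empirical distribution of `sample`
    and an exponential distribution whose mean is the sample mean."""
    xs = sorted(sample)
    n = len(xs)
    mean = sum(xs) / n
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = 1.0 - math.exp(-x / mean)  # theoretical distribution at x
        # Largest gap just after and just before the step at x.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


# Hypothetical time between troubles values in hours (not NTDS data).
sample = [0.4, 0.9, 1.2, 1.8, 2.1, 2.9, 3.5, 4.4, 5.6, 7.2]
d = ks_statistic_exponential(sample)

# Assumed tabulated critical value for n = 10 at the .05 level.
critical_value = 0.409
print(round(d, 3), "exponential not rejected" if d < critical_value
      else "exponential rejected")
```

Because the mean is estimated from the same data, a test of this
kind tends to reject less often than its nominal level suggests,
which is one more reason to treat a pass on a sample of 10 with
caution.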

The fact that this particular program passed a
goodness of fit test for the exponential does not mean that
programs, in general, have this distribution. Subsequent
analysis indicates that over a sufficiently long operating
period, the distribution of time between troubles and the
distribution of number of troubles occurring in a given time
interval are not stationary processes. The mean time between
troubles increases significantly with increased program
testing, although the form of the distribution in later test
periods may be the same as in earlier periods.

Another  random  variable  which  may  be  used  to  provide 
a  limited  form  of  reliability  prediction  is  program  run  time. 
The  complement  of  the  distribution  function  of  program  run 
time  is  the  conditional  probability  of  a  program  operating 
successfully  for  t  hours,  given  that  trouble  will  occur 
during the run. This interpretation is used because program
run  time  is  the  time  to  failure  for  programs  which  fail. 
Although  this  probability  is  not  equivalent  to  reliability, 
it  is  a  useful  measure  of  reliability  because  the  probability 
of  surviving  during  the  required  operating  time,  for  programs 
which  fail,  can  be  estimated.  In  addition,  program  run  time 
can  be  readily  obtained  for  NTDS  programs,  whereas  only 
approximate  values  of  the  time  between  troubles  variable  can 
be  obtained  by  laborious  methods. 

C. Analysis of Homogeneity of Reliability
Distribution Among Programs

A major objective of this research is the
determination of whether the various NTDS programs have the same
or different distributions. If all or many of the programs
have the same type of distribution, it might be possible to
develop a general model of software reliability which would
be valid for a large number of programs. Conversely, if
there is considerable variety in type of distribution, it
would be necessary to identify the type of distribution
which is applicable to a program in order to obtain its
reliability function.

Since the time between troubles variable was
difficult to obtain in large quantities from available data,
whereas the program run time variable was available for many
programs, the latter was utilized for this analysis.
Although the reliability function cannot be derived from
program run time, this variable is a measure of program
reliability. Hence, an analysis of program run time distributions
for various programs provides an indication of whether
programs have similar or dissimilar reliability characteristics.

In order to estimate the consistency, if any, of
type of distribution among programs, the following
program-ship combinations were analyzed:

• Program 1, Ship 1

• Program 1, Ship 2

• Program 1, Ships 1-7

• Program 5, Ship 8

• Ten Program-Ship Pairs Combination

The ship designation refers to a ship mock-up for on-shore
program testing and does not refer to data collected from a
ship at sea. The type of goodness of fit test that was made
was based on the shape of the program's histogram (exponential
tests were made for programs with an exponentially shaped
histogram). For some programs, goodness of fit tests were
made against two different distributions, when it was
convenient to do so. The results of the tests are listed in
Table 1. The figures are located in the Appendix.

These results suggest that there is a lack of
homogeneity of type of program run time distribution. Currently,
NTDS modules, rather than programs, are undergoing analysis.
Since a program consists of several modules, a program may be
too large and complex a unit to use for reliability analysis
due to the interactions among modules. Modules appear to be
more suitable for analysis because each module performs a
specific function and module coding has been somewhat
standardized. Also, due to differences in software interface
requirements among ships, ship operating requirements and
computer configurations have an effect on software reliability.
Since the various modules are used on many different ships,
the effect of ship environment on software reliability would
tend to be minimized when reliability is analyzed by module.

D. Analysis of Variance.

A second and more rigorous test for determining
whether significant differences in reliability characteristics
exist among programs was performed using Analysis of Variance
(AOV). This test involves the hypothesis of equality of means
among several populations. Program run time was used in this
test. A single category test was used. The single category
of classification was program/ship combination. It was of
interest to learn whether mean program run times differ for
various program-ship combinations. It would have also been
possible to perform a two category analysis (program and ship);
however, the primary interest at this stage of the analysis
was to compare program run time means for program-ship
combinations. Twenty-eight programs were used in one AOV test.
Some departures from the assumptions of an AOV test are present,
because program run time is not normally distributed for all 28
programs.¹ There is also some departure from the assumption of
equal variances. The results of the test are given in Table
A-3 in the Appendix. The hypothesis that all program run time
means are equal is rejected at the .05 level of significance.
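
The single category AOV described above reduces to the ratio
of between-group to within-group mean squares. The following
sketch computes that F ratio; the function name and the run
time values are hypothetical, and each group stands for one
program-ship combination.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a single category AOV.

    `groups` is a list of lists, one list of program run times per
    program-ship combination.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and spare observations")
    means = [sum(g) / len(g) for g in groups]
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)
    df_between = k - 1
    df_within = n_total - k
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return f_ratio, df_between, df_within


# Hypothetical program run times in hours for three combinations.
run_times = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_ratio, df_b, df_w = one_way_anova_f(run_times)
print(f_ratio, df_b, df_w)
```

The hypothesis of equal means is rejected when the computed
ratio exceeds the tabulated F value for the two degrees of
freedom at the chosen level of significance.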

A second AOV test was conducted using only Program 1
for Ships 1-7. In this case, the test involves equality of
program run time means for the same program used on seven
different ships. In this case, the category of classification
is Ship. Here, there is also some departure from the
assumptions of the AOV test. The results of this test are
summarized in Table A-5. The hypothesis of equality of program run
time means is rejected at the .05 level of significance. Since
this test was only conducted for one program, the result does
not mean that, in general, the ship environment effect is
significant. A two category (program and ship) AOV would
provide better information about the effects of program and
ship.

Thus, both tests, one involving 28 programs and many
ships and the other involving the same program for seven ships,
indicate that the programs are heterogeneous with respect to
reliability characteristics.

Although the AOV and the goodness of fit tests
(described in the previous section) are not exhaustive, the
results suggest that program reliability characteristics are
heterogeneous and that program reliability and quality control
may have to be dealt with on an individual program basis.

¹ A logarithmic transformation might be appropriate in order to
normalize the data for programs which have skewed distributions
that are approximately normal. If the transformation did
result in normalization, the assumptions of the AOV test would
be better fulfilled.
IV. Reliability Prediction

One approach to software reliability prediction is
to identify a theoretical reliability function which
represents a good fit to the empirical data. This approach would
be accomplished by using the following sequence:

. Tentative selection of reliability function based on shape
of frequency function of empirical data

. Estimation of reliability function parameters

. Identification of reliability function by using goodness
of fit tests

. Estimation of reliability function parameter confidence
limits

. Estimation of reliability function confidence limit

. Prediction of reliability and its confidence limit for
various intended operating times

. Comparison of required reliability with predicted
reliability

The implementation of the above sequence is
complicated by the fact that the time between troubles or number of
troubles per fixed time interval is not a stationary process
with respect to test time. As a result of a reduction in the
trouble rate as testing continues, the form of the
distribution may remain the same over time but parameter values may
change, or the actual form of the distribution may change.
This means that a reliability function which is based on the
total number of data points collected over the entire test
time may not be an accurate predictor, because the data set
is non-representative of the current state of the error
occurrence process. If the form of the distribution remains
the same throughout the test period and parameters change,
indicating an improvement in program quality as testing
continues, a smoothing technique could be applied to the most
recent data points in order to obtain parameter estimates
that would apply to the next time increment. The parameter
estimate would be updated as testing continues. If the form
of the distribution changes with test time, the problem is
much more complex and requires the identification of the
distribution which is most appropriate for each stage of
testing and operation. Unfortunately, sample size may be
drastically reduced when the currency of data points is
improved by eliminating out-of-date values.
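
The report does not name a particular smoothing technique.
One possibility, shown here purely as an assumption, is simple
exponential smoothing of the observed times between troubles,
so that recent values dominate the parameter estimate without
older values being discarded outright. The function name, the
weight `alpha` and the data are hypothetical.

```python
def smoothed_mtbt(times_between_troubles, alpha=0.3):
    """Exponentially smoothed mean time between troubles.

    Observations must be in the order in which they occurred.
    `alpha` (0 < alpha <= 1) is the weight given to the newest value;
    larger values track recent behavior more closely.
    """
    if not times_between_troubles:
        raise ValueError("need at least one observation")
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    estimate = times_between_troubles[0]
    for value in times_between_troubles[1:]:
        estimate = alpha * value + (1.0 - alpha) * estimate
    return estimate


# Hypothetical times between troubles that lengthen as testing continues.
observed = [1.0, 2.0, 3.0, 4.0]
plain_mean = sum(observed) / len(observed)
print(plain_mean, smoothed_mtbt(observed, alpha=0.5))
```

With lengthening times between troubles, the smoothed value
sits above the plain mean, which is the behavior wanted when
the estimate is meant to apply to the next time increment.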

The following indicates a procedure which would be
employed for reliability prediction, once an appropriate
reliability function is obtained. The fact that the
specifics of this procedure are based on the exponential
reliability function does not mean that the exponential
distribution can be applied to all programs. In addition, although
the specifics of the example are based on the exponential
distribution, the general procedure would be applicable to
other distributions.

It was shown earlier that for Program 1, Ship 1, an
exponential reliability function could be used. Although the
fit is not strong, it will be assumed that the exponential
applies in order to illustrate the procedure. The calculation
of the lower confidence limit for the mean time between
troubles (MTBT) and the reliability function is shown in
Section B of the Appendix. The procedure consists of estimating
the lower confidence limit of the MTBT and using this value in
the exponential reliability function to obtain the reliability
lower confidence limit. The exponential reliability function
and its 95 per cent lower confidence limit are shown in
Figure 4. The sample MTBT which was obtained is 2.94 hours;
the 95 per cent lower confidence limit of MTBT is 2.2? hours.
Exponential reliability is therefore $R = e^{-.34t}$ and the
lower limit is $R = e^{-.44t}$.
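
The calculation in Section B of the Appendix is not reproduced
here. As a sketch, a commonly used one-sided lower limit for an
exponential mean is 2T divided by a Chi Square value with 2n
degrees of freedom, where T is total test time and n is the
number of troubles. That formula, the Wilson-Hilferty
approximation used for the Chi Square value, and the inputs
(33 troubles, total time taken as 33 times the sample MTBT)
are all assumptions made for illustration.

```python
import math
from statistics import NormalDist


def chi2_quantile(p, df):
    """Wilson-Hilferty approximation to the Chi Square quantile."""
    z = NormalDist().inv_cdf(p)
    c = 2.0 / (9.0 * df)
    return df * (1.0 - c + z * math.sqrt(c)) ** 3


def mtbt_lower_limit(total_test_hours, n_troubles, confidence=0.95):
    """One-sided lower confidence limit on the exponential MTBT."""
    if n_troubles < 1:
        raise ValueError("need at least one trouble")
    return 2.0 * total_test_hours / chi2_quantile(confidence, 2 * n_troubles)


def exponential_reliability(mtbt, t):
    """R(t) = exp(-t / MTBT)."""
    return math.exp(-t / mtbt)


# Assumed inputs: 33 troubles and a sample MTBT of 2.94 hours.
n = 33
sample_mtbt = 2.94
lower = mtbt_lower_limit(n * sample_mtbt, n)

# Predicted reliability and its lower limit for one hour of operation.
print(round(lower, 2),
      round(exponential_reliability(sample_mtbt, 1.0), 3),
      round(exponential_reliability(lower, 1.0), 3))
```

Under these assumptions the lower limit falls between 2.2 and
2.3 hours, which is in line with the exponents quoted above.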

The reliability function performs two functions:
(1) it is the means of reliability prediction and (2) the
lower confidence limit can be compared with the required
reliability function for determining whether reliability
requirements are satisfied. If this is not the case, the
required reliability, MTBT, test time and allowable number
of troubles can be estimated. Two examples of this procedure,
using the assumed reliability objective shown in Figure 4,
are given in Section C of the Appendix. Both examples
pertain to a situation in which it is necessary to estimate
remaining test requirements after testing is under way. One
example pertains to incurring zero future troubles and the
other pertains to incurring 10 future troubles. Additional
test time, MTBT, reliability and lower reliability limit
requirements are estimated for the two cases and are
summarized in Section C. The reliability function which would
be required in order to satisfy the reliability objective is
shown in Figure 4. It is seen that the lower limit of this
reliability function is greater than or equal to the
reliability objective at all points during the operating time of
the program. Thus, the original reliability function
parameter estimate, pertaining to test results achieved to date,
is used in conjunction with the reliability objective to
estimate the remaining test performance requirements. A
revised reliability function which will satisfy the reliability
objective is also estimated. An interesting result of this
analysis is that the 10 trouble situation requires more test
time but lower MTBT and reliability than the zero trouble
situation, for a given lower reliability limit. This is due
to the narrower confidence band which is possible with a
larger sample size (greater number of troubles). The
examples illustrate that the reliability function can also be
used for program quality control by providing a means for
estimating the test performance which is necessary to satisfy
reliability specifications.

[Figure 4: Reliability Function and Its Confidence Limit
for Program 1, Ship 1 Using Exponential Reliability Function.
α = .05 Level of Significance]

In the example, if the program is tested for an
additional 1883 hours and no troubles occur, the required
reliability is demonstrated. If one trouble occurs before the
expiration of 1883 hours, an amount of time in addition to
1883 hours will be required to demonstrate reliability.

The amount of additional test time (1980 hours),
corresponding to 10 future troubles, would apply to the
situation in which reliability cannot be demonstrated prior to
the occurrence of the tenth trouble. If an additional test
time equal to 1980 hours has expired and no more than 10
troubles have occurred, reliability would be demonstrated.

The estimation of future test requirements is an
iterative process. At the termination of each test stage,
future test requirements are estimated on the basis of test
experience to date and required reliability. Test
requirements for each stage are specified in terms of paired values
of number of troubles and amount of test time. The pair
which will apply depends upon the software trouble
experienced during the next stage. Once reliability has been
demonstrated, testing is discontinued and the predicted
reliability function of the final stage becomes the
reliability function for operational use. Updating of the
reliability function would be continued during the operational phase
as additional data on software troubles are obtained.

As indicated previously, the reliability function
which applies during one test stage may not apply during a
subsequent test stage. As testing proceeds and additional
troubles occur, the type of reliability distribution or its
parameters are revised. The revised function is used to
obtain the reliability prediction for the next stage. At
the conclusion of each test stage, it is assumed that the
revised reliability function, with parameter estimate
updated for the next stage, is applicable to the next test
stage. Once additional data are obtained from the next
stage, estimates of reliability and test requirements are
revised as necessary. In the example, it was assumed that
the exponential distribution was applicable to the next stage.
However, the parameter estimate was revised in accordance
with assumed values of number of troubles occurring in the
next test stage.

In  some  cases,  a  change  in  the  type  of  reliability 
function  is  made  at  the  termination  of  a  test  stage,  if  a 
significant  change  occurs  in  the  distribution  of  time  between 
troubles. 

Equations for estimating the amount of test time
required in order to achieve a reliability objective are
formulated in Section D of the Appendix. Required test time
is a function of reliability lower limit and program operating
time (these constitute the reliability specifications),
$\chi^2$ (value of the Chi Square distribution) and number of troubles.
Required test time as a function of number of troubles is
shown in Figure 5 for required lower confidence limits of R(t) of
.85, .90 and .95 for one hour of operating time. These curves
can be used to estimate the amounts of test time required for
achieving specified reliabilities. For a given reliability
objective, test time increases approximately linearly with
number of troubles. However, test time increases rapidly
with increases in reliability objective. For example, if the
reliability objective is increased from .90 to .95 for 18
troubles, the reliability requirement increases by 5.6 per
cent and required test time increases from 240 to 500 hours,
or an increase of 108 per cent. This set of curves is
applicable only to exponential reliability functions.
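
The dependence of test time on the reliability objective can be sketched numerically. The sketch below is an illustration, not the report's own procedure. It assumes the exponential lower-limit relations used in the Appendix example, R_L = exp(-t/T_L) and T_L = 2T/(Chi Square value at 2n degrees of freedom), which combine to T = -t (Chi Square value) / (2 ln R_L). The function name and the table value 50.998 (Chi Square, 36 degrees of freedom, .95 level) are supplied here for illustration and do not appear in the text.

```python
import math

def required_test_time(r_lower, operating_hours, chi_sq):
    """Total test hours T at which the lower confidence limit on
    exponential reliability equals r_lower for operating_hours.

    Assumes R_L = exp(-t / T_L) and T_L = 2 * T / chi_sq, where chi_sq
    is the Chi Square table value for 2n degrees of freedom.
    """
    if not 0.0 < r_lower < 1.0:
        raise ValueError("r_lower must lie strictly between 0 and 1")
    mtbt_lower = -operating_hours / math.log(r_lower)  # T_L in hours
    return mtbt_lower * chi_sq / 2.0

# 18 troubles -> 36 degrees of freedom; 50.998 is the standard table
# value of Chi Square at the .95 level (an input assumed for this sketch).
CHI_SQ_36_95 = 50.998

t_90 = required_test_time(0.90, 1.0, CHI_SQ_36_95)
t_95 = required_test_time(0.95, 1.0, CHI_SQ_36_95)
print(f"R_L = .90: {t_90:.0f} hours")
print(f"R_L = .95: {t_95:.0f} hours")
print(f"increase: {100.0 * (t_95 - t_90) / t_90:.0f} per cent")
```

Under these assumptions the sketch gives roughly 240 and 500 hours for 18 troubles, in line with the values quoted from Figure 5; small differences from the quoted percentage are to be expected because the figures in the text are read from curves.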

Required MTBT and reliability versus number of troubles,
for various values of reliability objective, are shown in
Figure 6. The curves in Figure 6 can be used to estimate the
MTBT and reliability that are required, for a given number of
troubles, in order to satisfy reliability requirements. This
set of curves is applicable only to exponential reliability
functions.

V.  Results  and  Conclusions. 

Major results and conclusions of the first phase
of the research are given below.

1. A methodology for software reliability prediction and
quality control has been presented which could be implemented
in an NTDS software production environment. The
value of the methodology is that it provides a framework
for software reliability analysis. The specifics of the
approach will probably be supplanted by an improved model
which is now under development.

2. Methods have been described for estimating the reliability
and test performance requirements which are necessary in
order to satisfy program reliability objectives.

3. Major factors which affect software reliability prediction
accuracy are the heterogeneity of reliability characteristics
among programs and the non-stationary nature of the
error occurrence process.

4. NTDS programs appear to be heterogeneous with respect to
reliability characteristics.

5. Reliability prediction and quality control measures should
be applied on an individual program/ship combination basis,
due to the significant variability in reliability characteristics
among programs.

f  theoretical frequency (normal distribution)

TABLE A-2


TABLE A-3

Analysis of Variance Results for Program Run Time
28 Programs

                    Sum of Squares     df     Mean Square
                                                 ss/df

Between Programs         123           27        4.56
Within Programs          558          268        2.08
Total                    681

F = 4.56/2.08 = 2.19    Reject hypothesis of equal means

F.95(24*, 120**) = 1.61
F.95(24*, ∞) = 1.52

*   Nearest table value.
**  Highest table value before infinity.

Legend

df  degrees of freedom
ss  sum of squares
F   value of F distribution
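
The F ratio in Table A-3 can be reproduced from the tabulated sums of squares and degrees of freedom. The following is a minimal sketch for checking the arithmetic; the function name is hypothetical, and the critical value 1.61 is simply the table value quoted in Table A-3.

```python
def f_ratio(ss_between, df_between, ss_within, df_within):
    """Return (mean square between, mean square within, F ratio)."""
    ms_between = ss_between / df_between
    ms_within = ss_within / df_within
    return ms_between, ms_within, ms_between / ms_within

# Sums of squares and degrees of freedom from Table A-3 (28 programs).
ms_b, ms_w, f_value = f_ratio(123.0, 27, 558.0, 268)

F_CRITICAL = 1.61  # F.95(24, 120), nearest table value quoted in Table A-3
reject_equal_means = f_value > F_CRITICAL

print(f"MS between = {ms_b:.2f}, MS within = {ms_w:.2f}, F = {f_value:.2f}")
print("Reject hypothesis of equal means" if reject_equal_means
      else "Do not reject hypothesis of equal means")
```

Because the computed F exceeds both quoted critical values, the conclusion does not depend on which of the two table entries is used.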
TABLE A-4

K-S Test for Exponential
Ship 1 Program 1
Time Between Troubles Distribution

S(t)      F(t)      D(t)

 .10       .16       .06
 .50       .40       .10
 .50       .57       .07
 .60       .69       .09
 .70       .78       .08
 .80       .85       .05
 .80       .89       .09
 .90       .92       .02
1.00       .99       .01

N = 10 values of time between troubles, involving 33
Trouble Reports

D(t)max = .10

d10,.05 = .409

Accept Exponential

F(t)  Theoretical CDF
D(t) = |S(t) - F(t)|
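
The K-S decision in Table A-4 amounts to comparing the largest tabulated difference with the critical value. The sketch below is illustrative only: it takes the two CDF columns as they appear in the table, reads S(t) as the sample CDF (an assumption, since only the legend entries for F(t) and D(t) survive), and uses a hypothetical function name.

```python
# Columns of Table A-4 as transcribed: S(t) and theoretical CDF F(t).
sample_cdf = [0.10, 0.50, 0.50, 0.60, 0.70, 0.80, 0.80, 0.90, 1.00]
theoretical_cdf = [0.16, 0.40, 0.57, 0.69, 0.78, 0.85, 0.89, 0.92, 0.99]

def ks_statistic(sample, theoretical):
    """Largest absolute difference D(t) = |S(t) - F(t)| over the rows."""
    return max(abs(s - f) for s, f in zip(sample, theoretical))

D_CRITICAL = 0.409  # d(10, .05), critical value quoted in Table A-4

d_max = ks_statistic(sample_cdf, theoretical_cdf)
accept_exponential = d_max <= D_CRITICAL

print(f"D(t)max = {d_max:.2f}, critical value = {D_CRITICAL}")
print("Accept Exponential" if accept_exponential else "Reject Exponential")
```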
TABLE A-5

Analysis of Variance Results for Program Run Time
for Program 1, Various Ships1

                    Mean Square
                       ss/df

Between Ships          17.6
Within Ships           1.50

F = 17.6/1.50 = 11.7    Reject hypothesis of equal means.

F.95(8, 60*) = 2.10
[Figure: axis label "Program Run Time (Hours)"]
C.

Example of Determining Required Reliability, MTBT and Test Performance

Assume desired reliability of Program 1 for Ship 1 is given as follows.

(1)  .95  for first  .5   hour of operation
(2)  .90  for next   1.0  hour of operation
(3)  .85  for next   6.0  hours of operation

Use  exponential  distribution  for  reliability  function.  (It  has  been 
previously  determined  that  Program  1,  Ship  1  can  be  represented  by  an 
exponential  reliability  function.) 


Lower limit on MTBT for exponential distribution =

$T_L = \frac{2n\hat{T}}{\chi^2_{2n,\,1-\alpha}}$

where $\hat{T}$ is the estimated MTBT and n is the number of troubles.

Lower limit on exponential reliability =

R_L = exp(-t/T_L)

(1) For R_L = .95 and t = .5 hours, .95 = exp(-.5/T_L)
log .95 = -.5/T_L, -.0513 = -.5/T_L
T_L = 9.75 hours.

(2) For R_L = .90 and t = 1.5 hours, .90 = exp(-1.5/T_L)
log .90 = -1.5/T_L, -.1054 = -1.5/T_L
T_L = 14.2 hours.

(3) For R_L = .85 and t = 7.5 hours, .85 = exp(-7.5/T_L)
log .85 = -7.5/T_L, -.1625 = -7.5/T_L
T_L = 46.1 hours.
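
The three lower limits follow from inverting R_L = exp(-t/T_L) at the cumulative operating time of each requirement. A minimal sketch of that arithmetic is given below; the function and variable names are hypothetical. The largest of the three values is taken as governing, which is an interpretation based on the later steps of the example, where 46.1 hours is used.

```python
import math

def mtbt_lower_limit(r_lower, cumulative_hours):
    """T_L such that exp(-cumulative_hours / T_L) equals r_lower."""
    return -cumulative_hours / math.log(r_lower)

# (reliability lower limit, cumulative operating hours) from the example:
# .5 hour, then .5 + 1.0 = 1.5 hours, then 1.5 + 6.0 = 7.5 hours.
requirements = [(0.95, 0.5), (0.90, 1.5), (0.85, 7.5)]

limits = [mtbt_lower_limit(r, t) for r, t in requirements]
governing = max(limits)  # a larger T_L gives higher reliability at every t

for (r, t), t_l in zip(requirements, limits):
    print(f"R_L = {r:.2f}, t = {t:.1f} h -> T_L = {t_l:.2f} h")
print(f"governing T_L = {governing:.1f} h")
```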
Required additional test time if no more troubles occur:

= (46.1)(86)/2 - (33)(2.94) = 1980 - 97 = 1883 hours

Check: MTBT = (97 + 1883)/(33 + 0) = 60.0 hours

Required reliability = R = exp(-t/60.0) = exp(-.0167t)

Lower reliability limit = R_L = exp(-t/46.1) = exp(-.022t)

2. Requirements If 10 Troubles Arise During Future Testing

Now in example, use r = 10. Required MTBT =

$\frac{T_L\,\chi^2_{86,\,.95}}{86} = \frac{(46.1)(108.6)}{86}$ = 58.2 hours

Required additional test time if 10 more troubles occur:

= (46.1)(108.6)/2 - 97 = 2503 - 97 = 2406 hours

Check: MTBT = (97 + 2406)/(33 + 10) = 58.2 hours

Required reliability = R = exp(-t/58.2) = exp(-.0172t)

Lower reliability limit = exp(-.022t)

The foregoing calculations are summarized below.

The  foregoing  calculations  are  summarized  below. 


3. Summary

                                       Requirements for Satisfying
                                         Reliability Objectives

                           Existing    Zero Future     Ten Future
                                       Troubles        Troubles

MTBT                       2.94 hrs    60.0            58.2
MTBT Lower Limit           2.27 hrs    46.1            46.1
Required Reliability       exp(-.34t)  exp(-.0167t)    exp(-.0172t)
Reliability Lower Limit    exp(-.44t)  exp(-.022t)     exp(-.022t)
Additional Test Time       -           1883 hrs        2406 hrs
Total Test Time            97 hrs      1980 hrs        2503 hrs
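
The two future-testing cases in the summary can be recomputed with a short sketch. It is an illustration under stated assumptions rather than the report's procedure: it assumes the exponential lower-limit relation T_L = 2T/(Chi Square value), with the Chi Square value taken at twice the total number of troubles, and it takes the table values 86 and 108.6 from the example as inputs. The function and variable names are hypothetical.

```python
def stage_requirements(mtbt_lower, chi_sq, hours_to_date,
                       troubles_to_date, future_troubles):
    """Return (total hours, additional hours, required MTBT).

    Assumes T_L = 2 * T / chi_sq, with chi_sq taken at
    2 * (troubles_to_date + future_troubles) degrees of freedom.
    """
    total_hours = mtbt_lower * chi_sq / 2.0
    additional_hours = total_hours - hours_to_date
    total_troubles = troubles_to_date + future_troubles
    return total_hours, additional_hours, total_hours / total_troubles

T_LOWER = 46.1               # governing MTBT lower limit, hours
HOURS_TO_DATE = 33 * 2.94    # 33 troubles at an MTBT of 2.94 hours

# Chi Square values at the .95 level as used in the example:
# 86 for 66 degrees of freedom, 108.6 for 86 degrees of freedom.
for future, chi_sq in [(0, 86.0), (10, 108.6)]:
    total, extra, mtbt = stage_requirements(
        T_LOWER, chi_sq, HOURS_TO_DATE, 33, future)
    print(f"{future:2d} future troubles: total {total:.0f} h, "
          f"additional {extra:.0f} h, required MTBT {mtbt:.1f} h")
```

The sketch carries more digits than the hand calculation, so the zero-trouble case comes out a few hours above the rounded 1980 and 1883 hours shown in the summary.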
Expression for Estimating Required Test Time
Exponential Reliability Function

[Equations (6) through (9), introduced by "From (3)", "Using (6) and (7)" and "Also,", are not recoverable from the source.]

This gives required test time T in terms of lower confidence limit
reliability R_L, operating time t, number of troubles n, Chi Square
distribution $\chi^2_{2n,\,1-\alpha}$, and level of significance $\alpha$.
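
As a hedged reconstruction only, and not the report's own numbered equations, the relations used in the worked example lead to a closed form for T. The derivation assumes that the estimated MTBT is T/n and that the Chi Square value is taken at 2n degrees of freedom, as the table values used in the example (86 and 108.6) suggest.

```latex
\begin{align*}
T_L &= \frac{2n\,(T/n)}{\chi^2_{2n,\,1-\alpha}}
     = \frac{2T}{\chi^2_{2n,\,1-\alpha}}, \\
R_L &= \exp(-t/T_L)
     \;\Longrightarrow\; T_L = \frac{-t}{\ln R_L}, \\
T   &= \frac{T_L\,\chi^2_{2n,\,1-\alpha}}{2}
     = \frac{-t\,\chi^2_{2n,\,1-\alpha}}{2\,\ln R_L}.
\end{align*}
```

With T_L = 46.1 hours and a Chi Square value of 86, the last line gives a total test time of about 1980 hours, which agrees with the zero-future-troubles case of the example.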


